(57) Abstract

The invention provides a method and
system for duplicating all or part of a file
system while maintaining consistent copies of
the file system. The file server maintains a set
of snapshots, each indicating a set of storage
blocks making up a consistent copy of the
file system as it was at a known time. Each
snapshot can be used for a purpose other than
maintaining the coherency of the file system,
such as duplicating or transferring a backup
copy of the file system to a destination storage
medium. In a preferred embodiment,
snapshots can be manipulated to identify sets of
storage blocks in the file system for incremental
backup or copying, or to provide a file system
backup that is both complete and relatively
inexpensive.
Title of the Invention
File System Image Transfer
Background of the Invention

1. Field of the Invention

The invention relates to storage systems.

2. Related Art

In computer file systems for storing and retrieving information, it is
sometimes advantageous to duplicate all or part of the file system. For example, one
purpose for duplicating a file system is to maintain a backup copy of the file system to
protect against lost information. Another purpose for duplicating a file system is to
provide replicas of the data in that file system available at multiple servers, to be able to
share load incurred in accessing that data.

One problem in the known art is that known techniques for duplicating data
in a file system either are relatively awkward and slow (such as duplication to tape), or
are relatively expensive (such as duplication to an additional set of disk drives). For
example, known techniques for duplication to tape rely on logical operations of the file
system and the logical format of the file system. Being relatively cumbersome and slow
discourages frequent use, resulting in backup copies that are relatively stale. When data
is lost, the most recent backup copy might then be a day old, or several days old, severely
reducing the value of the backup copy.

Similarly, known techniques for duplication to an additional set of disk
drives rely on the physical format of the file system as stored on the original set of disk
drives. These known techniques use an additional set of disk drives for duplication of the
entire file system. Being relatively expensive discourages use, particularly for large file
systems. Also, relying on the physical format of the file system complicates operations
for restoring backup data and for performing incremental backup.

Accordingly, it would be desirable to provide a method and system for
duplicating all or part of a file system, which can operate with any type of storage
medium without either relative complexity or expense, and which can provide all the
known functions for data backup and restore. This advantage is achieved in an
embodiment of the invention in which consistent copies of the file system are
maintained, so those consistent snapshots can be transferred at a storage block level using
the file server's own block level operations.

Summary of the Invention

The invention provides a method and system for duplicating all or part of a
file system while maintaining consistent copies of the file system. The file server
maintains a set of snapshots, each indicating a set of storage blocks making up a
consistent copy of the file system as it was at a known time. Each snapshot can be used
for a purpose other than maintaining the coherency of the file system, such as duplicating
or transferring a backup copy of the file system to a destination storage medium. In a
preferred embodiment, the snapshots can be manipulated to identify sets of storage
blocks in the file system for incremental backup or copying, or to provide a file system
backup that is both complete and relatively inexpensive.






Brief Description of the Drawings

Figure 1 shows a block diagram of a first system for file system image
transfer.

Figure 2 shows a block diagram of a set of snapshots in a system for file
system image transfer.

Figure 3 shows a process flow diagram of a method for file system image
transfer.

Detailed Description of the Preferred Embodiment 

In the following description, a preferred embodiment of the invention is 
described with regard to preferred process steps and data structures. However, those 
skilled in the art would recognize, after perusal of this application, that embodiments of
the invention may be implemented using one or more general purpose processors (or 
special purpose processors adapted to the particular process steps and data structures) 
operating under program control, and that implementation of the preferred process steps 
and data structures described herein using such equipment would not require undue 
experimentation or further invention. 

Inventions described herein can be used in conjunction with inventions 
described in the following applications: 

o Application Serial No. 08/471,218, filed June 5, 1995, in the name of inventors
David Hitz et al., titled "A Method for Providing Parity in a Raid Sub-System
Using Non-Volatile Memory", attorney docket number NET-004;

o Application Serial No. 08/454,921, filed May 31, 1995, in the name of inventors
David Hitz et al., titled "Write Anywhere File-System Layout", attorney docket
number NET-005;

o Application Serial No. 08/464,591, filed May 31, 1995, in the name of inventors
David Hitz et al., titled "Method for Allocating Files in a File System Integrated
with a Raid Disk Sub-System", attorney docket number NET-006.

Each of these applications is hereby incorporated by reference as if fully set
forth herein. They are collectively referred to as the "WAFL Disclosures."
File Servers and File System Image Transfer

Figure 1 shows a block diagram of a system for file system image transfer. 

A system 100 for file system image transfer includes a file server 110 and a
destination file system 120.

The file server 110 includes a processor 111, a set of program and data
memory 112, and mass storage 113, and preferably includes a file server 110 like one
described in the WAFL Disclosures. In a preferred embodiment, the mass storage 113
includes a RAID storage subsystem.

The destination file system 120 includes mass storage, such as a flash
memory, a magnetic or optical disk drive, a tape drive, or other storage device. In a
preferred embodiment, the destination file system 120 includes a RAID storage
subsystem. The destination file system 120 can be coupled directly or indirectly to the
file server 110 using a communication path 130.

In a first preferred embodiment, the destination file system 120 is coupled
to the file server 110 and controlled by the processor 111 similarly to the mass storage
113. In this first preferred embodiment, the communication path 130 includes an internal
bus for the file server 110, such as an I/O bus, a mezzanine bus, or other system bus.

In a second preferred embodiment, the destination file system 120 is
included in a second file server 110. The second file server 110, similar to the first file
server 110, includes a processor 111, a set of program and data memory 112, and mass
storage 113 that serves as the destination file system 120 with regard to the first file
server 110. The second file server 110 also preferably includes a file server 110 like one
described in the WAFL Disclosures. In this second preferred embodiment, the
communication path 130 includes a network path between the first file server 110 and the
second file server 110, such as a direct communication link, a LAN (local area network),
a WAN (wide area network), a NUMA network, or another interconnect.




WO 00/07104



PCT/US99/17148 



In a third preferred embodiment, the communication path 130 includes an
intermediate storage medium, such as a tape, and the destination file system 120 can be
either the first file server 110 itself or a second file server 110. As shown below, when
the file server 110 selects a set of storage blocks for transfer to the destination file system
120, that set of storage blocks can be transferred by storing them onto the intermediate
storage medium. At a later time, retrieving that set of storage blocks from the
intermediate storage medium completes the transfer.

It is an aspect of the invention that there are no particular restrictions on the
communication path 130. For example, a first part of the communication path 130 can
include a relatively high-speed transfer link, while a second part of the communication
path 130 can include an intermediate storage medium.

It is a further aspect of the invention that the destination file system 120
can be included in the first file server 110, in a second file server 110, or distributed
among a plurality of file servers 110. Transfer of storage blocks from the first file server
110 to the destination file system 120 is thus completely general, and includes the
possibility of a wide variety of different file system operations:

o Storage blocks from the first file server 110 can be dumped to an intermediate
storage medium, such as a tape or a second disk drive, retained for a period of
time, and then restored to the first file server 110. Thus, the first file server 110
can itself be the destination file system.

o Storage blocks from the first file server 110 can be transferred to a second file
server 110, and used at that second file server 110. Thus, the storage blocks can
be copied en masse from the first file server 110 to the second file server 110.

o Storage blocks from the first file server 110 can be distributed using a plurality of
different communication paths 130, so that some of the storage blocks are
immediately accessible while others are recorded in a relatively slow intermediate
storage medium, such as tape.

o Storage blocks from the first file server 110 can be selected from a complete file
system, transferred using the communication path 130, and then processed to form
a complete file system at the destination file system 120.

In alternative embodiments described herein, the second file server 110 can
have a second destination file system 120. That second destination file system 120 can
be included within the second file server 110, or can be included within a third file server
110 similar to the first file server 110 or the second file server 110.

More generally, each nth file server 110 can have a destination file system
120, either included within the nth file server 110, or included within an (n+1)th file server
110. The set of file servers 110 can thus form a directed graph, preferably a tree with the
first file server 110 as the root of that tree.

File System Storage Blocks

As described in the WAFL Disclosures, a file system 114 on the file server
110 (and in general, on the nth file server 110), includes a set of storage blocks 115, each
of which is stored either in the memory 112 or on the mass storage 113. The file system
114 includes a current block map, which records which storage blocks 115 are part of the
file system 114 and which storage blocks 115 are free.

As described in the WAFL Disclosures, the file system on the mass storage
113 is at all times consistent. Thus, the storage blocks 115 included in the file system at
all times comprise a consistent file system 114.

As used herein, the term "consistent," referring to a file system (or to
storage blocks in a file system), means a set of storage blocks for that file system that
includes all blocks required for the data and file structure of that file system. Thus, a
consistent file system stands on its own and can be used to identify a state of the file
system at some point in time that is both complete and self-consistent.



As described in the WAFL Disclosures, when changes to the file system
114 are committed to the mass storage 113, the block map is altered to show those
storage blocks 115 that are part of the committed file system 114. In a preferred
embodiment, the file server 110 updates the file system frequently, such as about once
each 10 seconds.

Snapshots 

Figure 2 shows a block diagram of a set of snapshots in a system for file
system image transfer.

As used herein, a "snapshot" is a set of storage blocks, the member storage
blocks forming a consistent file system, disposed using a data structure that allows for
efficient set management. The efficient set management can include time efficiency for
set operations (such as logical sum, logical difference, membership, add member, remove
member). For example, the time efficiency can include O(n) time or less for n storage
blocks. The efficient set management can also include space efficiency for enumerating
the set (such as association with physical location on mass storage or inverting the
membership function). The space efficiency can mean about 4 bytes or less per 4K
storage block of disk space, a ratio about 1000:1 better than duplicating the storage
space.
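
As a rough check on the quoted ratio, the arithmetic can be worked through in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and "4K" is assumed to mean 4096 bytes.

```python
BLOCK_SIZE = 4 * 1024   # assumed: a "4K" storage block is 4096 bytes
ENTRY_BYTES = 4         # upper bound on snapshot bookkeeping per block, from the text


def space_ratio(block_size=BLOCK_SIZE, entry_bytes=ENTRY_BYTES):
    """Bytes of storage described per byte of snapshot bookkeeping."""
    return block_size // entry_bytes


def bookkeeping_bytes(volume_bytes, block_size=BLOCK_SIZE, entry_bytes=ENTRY_BYTES):
    """Bookkeeping needed to describe every block of a volume (hypothetical helper)."""
    num_blocks = volume_bytes // block_size
    return num_blocks * entry_bytes


ratio = space_ratio()                  # 1024, which the text rounds to "about 1000:1"
one_gib = bookkeeping_bytes(1 << 30)   # bookkeeping for a 1 GiB volume, in bytes
```

Under these assumptions, describing a 1 GiB volume takes at most 1 MiB of bookkeeping, whereas duplicating the volume would take another 1 GiB.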

As described herein, the data structure for the snapshot is stored in the file
system so there is no need to traverse the file system tree to recover it. In a preferred
embodiment, each snapshot is stored as a file system object, such as a blockmap. The
blockmap includes a bit plane having one bit for each storage block, other than bits used
to identify if the storage block is in the active file system.

Moreover, when the file system is backed-up, restored, or otherwise copied
or transferred, the blockmap within the file system is as part of the same operation itself
also backed-up, restored, or otherwise copied or transferred. Thus, operations on the file
system inherently include preserving snapshots.

Any particular snapshot can be transferred by any communication
technique, including

o transfer using storage in an intermediate storage medium (such as nonvolatile
memory, tape, disk in the same file system, disk in a different file system, or disk
distributed over several file systems);

o transfer using one or more network messages;

o transfer using communication within a single file server or set of file servers (such
as for storage to disk in the same file system, to disk in a different file system, or
to disk distributed over several file systems).

A collection 200 of snapshots 210 includes one bit plane for each snapshot
210. Each bit plane indicates a set of selected storage blocks 115. In the figure, each
column indicates one bit plane (that is, one snapshot 210), and each row indicates one
storage block 115 (that is, the history of that storage block 115 being included in or
excluded from successive snapshots 210). At the intersection of each column and each
row there is a bit 211 indicating whether that particular storage block 115 is included in
that particular snapshot 210.

Each snapshot 210 comprises a collection of selected storage blocks 115
from the file system 114 that formed all or part of the (consistent) file system 114 at
some point in time. A snapshot 210 can be created in response to the block map at any
time by copying the bits from the block map indicating which storage blocks 115 are part
of the file system 114 into the corresponding bits 211 for the snapshot 210.
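
The row-and-column layout described above can be modeled in a short Python sketch. It is a simplified illustration, not the on-disk format: the class and method names are hypothetical, each row is held as a Python integer, and the block map is assumed to be a list of 0/1 flags.

```python
class SnapshotCollection:
    """Hypothetical model: one integer per storage block, one bit position per snapshot."""

    def __init__(self, num_blocks):
        self.rows = [0] * num_blocks   # rows[i] is the inclusion history of block i
        self.num_snapshots = 0

    def create_snapshot(self, block_map):
        """Copy the in-use bits of the block map into a new bit plane; return its index."""
        plane = self.num_snapshots
        for block, in_use in enumerate(block_map):
            if in_use:
                self.rows[block] |= 1 << plane
        self.num_snapshots += 1
        return plane

    def contains(self, plane, block):
        """Read the bit at the intersection of one snapshot column and one block row."""
        return bool((self.rows[block] >> plane) & 1)

    def blocks_in(self, plane):
        """Enumerate the storage blocks selected by one snapshot."""
        return {b for b in range(len(self.rows)) if self.contains(plane, b)}


snaps = SnapshotCollection(num_blocks=6)
s0 = snaps.create_snapshot([1, 1, 0, 1, 0, 0])   # block map at an earlier time
s1 = snaps.create_snapshot([1, 0, 1, 1, 1, 0])   # block map at a later time
```

In this model, creating a snapshot touches each block's row once, so it costs O(n) for n storage blocks and copies no block data.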

Differences between the snapshots 210 and the (active) file system 114
include the following:

o The file system 114 is a consistent file system 114 that is being used and perhaps
modified, while the snapshots 210 represent copies of the file system 114 that are
read-only.

o The file system 114 is updated frequently, while the snapshots 210 represent
copies of the file system 114 that are from the relatively distant past.

o There is only one active file system 114, while there can be (and typically are)
multiple snapshots 210.



At selected times, the file server 110 creates a new bit plane, in response to
the block map, to create a new snapshot 210. As described herein, snapshots 210 are
used for backup and mirroring of the file system 114, so in preferred embodiments, new
snapshots 210 are created at periodic times, such as once per hour, day, week, month, or
as otherwise directed by an operator of the file server 110.

Storage Images and Image Streams 



As used herein, a "storage image" includes an indicator of a set of storage
blocks selected in response to one or more snapshots. The technique for selection can
include logical operations on sets (such as pairs) of snapshots. In a preferred
embodiment, these logical operations can include logical sum and logical difference.

As used herein, an "image stream" includes a sequence of storage blocks
from a storage image. A set of associated block locations for those storage blocks from
the storage image can be identified in the image stream either explicitly or implicitly.
For a first example, the set of associated block locations can be identified explicitly by
including volume block numbers within the image stream. For a second example, the set
of associated block locations can be identified implicitly by the order in which the
storage blocks from the storage image are positioned or transferred within the image
stream.
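
The two ways of identifying block locations can be contrasted in a brief Python sketch. The function names are hypothetical, and the implicit variant assumes that sender and receiver both know the set of block numbers in the storage image and agree on ascending order, which the text does not specify.

```python
def explicit_stream(image_blocks, read_block):
    """Yield (volume block number, data) pairs: locations travel inside the stream."""
    for vbn in sorted(image_blocks):
        yield vbn, read_block(vbn)


def implicit_stream(image_blocks, read_block):
    """Yield data only: locations are implied by the agreed ordering."""
    for vbn in sorted(image_blocks):
        yield read_block(vbn)


def decode_implicit(image_blocks, stream):
    """Pair each received block with its location using the same ordering rule."""
    return dict(zip(sorted(image_blocks), stream))


disk = {3: b"c", 1: b"a", 7: b"g"}   # toy volume: block number -> contents
image = {1, 3, 7}                    # storage image as a set of block numbers

tagged = list(explicit_stream(image, disk.__getitem__))
recovered = decode_implicit(image, implicit_stream(image, disk.__getitem__))
```

The explicit form costs a few extra bytes per block but lets the receiver place blocks without any prior knowledge of the image.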
The sequence of storage blocks within the image stream can be optimized
for a file system operation. For example, the sequence of storage blocks within the
image stream can be optimized for a backup or restore file system operation.

In a preferred embodiment, the sequence of storage blocks is optimized so
that copying of an image stream and transfer of that image stream from one file server to
another is optimized. In particular, the sequence of storage blocks is selected so that
storage blocks identified in the image stream can be, as much as possible, copied in
parallel from a plurality of disks in a RAID file storage system, so as to maximize the
transfer bandwidth from the first file server.
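
One way such an ordering might be chosen is sketched below in Python. The text does not say how blocks map to disks, so the layout here is purely an assumption: each disk is taken to hold a contiguous range of `blocks_per_disk` block numbers, and the hypothetical function `interleave_by_disk` takes one block from each disk in turn so that consecutive reads fall on different disks.

```python
from itertools import zip_longest


def interleave_by_disk(image_blocks, blocks_per_disk, num_disks):
    """Order blocks round-robin across disks (assumed contiguous layout per disk)."""
    per_disk = [[] for _ in range(num_disks)]
    for vbn in sorted(image_blocks):
        per_disk[vbn // blocks_per_disk].append(vbn)   # assumed block-to-disk rule
    order = []
    for group in zip_longest(*per_disk):               # one block from each disk per pass
        order.extend(vbn for vbn in group if vbn is not None)
    return order


# Blocks 0-3 are assumed to live on disk 0, 4-7 on disk 1, and 8-11 on disk 2.
order = interleave_by_disk({0, 1, 2, 4, 5, 8}, blocks_per_disk=4, num_disks=3)
```

With this layout, plain ascending order would drain disk 0 before touching disk 1, while the interleaved order keeps all three disks busy for as long as each still has blocks to contribute.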

A storage image 220 comprises a set of storage blocks 115 to be copied
from the file system 114 to the destination file system 120.

The storage blocks 115 in the storage image 220 are selected so that when
copied, they can be combined to form a new consistent file system 114 on the destination
file system 120. In various preferred embodiments, the storage image 220 that is copied
can be combined with storage blocks 115 from other storage images 220 (which were
transferred at earlier times).

As shown herein, the file server 110 creates each storage image 220 in
response to one or more snapshots 210.

An image stream 230 comprises a sequence of storage blocks 115 from a
storage image 220. When the storage image 220 is copied from the file system 114, the
storage blocks 115 are ordered into the image stream 230 and tagged with block location
information. When the image stream 230 is received at the destination file system 120,
the storage blocks 115 in the image stream 230 are copied onto the destination file
system 120 in response to the block location information.



Image Addition and Subtraction

The system 100 manipulates the bits 211 in a selected set of storage images
220 to select sets of storage blocks 115, and thus form a new storage image 220.

For example, the following different types of manipulation are possible:

o The system 100 can form a logical sum of two storage images 220 A + B by
forming a set of bits 211 each of which is the logical OR (A v B) of the
corresponding bits 211 in the two storage images 220. The logical sum of two
storage images 220 A + B is the union of those two storage images 220.

o The system 100 can form a logical difference of two storage images 220 A - B by
forming a set of bits 211 each of which is logical "1" only if the corresponding bit
211 A is logical "1" and the corresponding bit 211 B is logical "0" in the two
storage images 220.
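
The two manipulations above map directly onto bitwise operators. The Python sketch below is illustrative only: it represents each storage image as an integer whose bit i stands for storage block i, and the function names are hypothetical.

```python
def logical_sum(a, b):
    """A + B: a bit is 1 if it is 1 in either image (union)."""
    return a | b


def logical_difference(a, b):
    """A - B: a bit is 1 only if it is 1 in A and 0 in B."""
    return a & ~b


def blocks(image):
    """List the block numbers whose bits are set in an image."""
    return {i for i in range(image.bit_length()) if (image >> i) & 1}


A = 0b1011   # blocks 0, 1, 3
B = 0b0110   # blocks 1, 2

total = logical_sum(A, B)
delta = logical_difference(A, B)
```

Both operations process whole machine words of bits at a time, which is consistent with the O(n)-or-better set operations the snapshot definition calls for.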

The logical sum of two storage images 220 A + B comprises a storage
image 220 that includes storage blocks 115 in either of the two original storage images
220. Using the logical sum, the system 100 can determine not just a single past state of
the file system 114, but also a history of past states of that file system 114 that were
recorded as snapshots 210.

The logical difference of two selected storage images 220 A - B comprises
just those storage blocks that are included in the storage image 220 A but not in the
storage image 220 B. (To preserve integrity of incremental storage images, the
subtrahend storage image 220 B is always a snapshot 210.) A logical difference is useful
for determining a storage image 220 having a set of storage blocks forming an
incremental image, which can be used in combination with full images.
In alternative embodiments, other and further types of manipulation may
also be useful. For example, it may be useful to determine a logical intersection of
snapshots 210, so as to determine which storage blocks 115 were not changed between
those snapshots 210.

In further alternative embodiments, the system 100 may also use the bits
211 from each snapshot 210 for other purposes, such as to perform other operations on
the storage blocks 115 represented by those bits 211.

Incremental Storage Images

As used herein, an "incremental storage image" is a logical difference
between a first storage image and a second storage image.

As used herein, in the logical difference A - B, the storage image 220 A is
called the "top" storage image 220, and the storage image 220 B is called the "base"
storage image 220.

When the base storage image 220 B comprises a full set F of storage blocks
115 in a consistent file system 114, the logical difference A - B includes those
incremental changes to the file system 114 between the base storage image 220 B and the
top storage image 220 A.

Each incremental storage image 220 has a top storage image 220 and a base
storage image 220. Incremental storage images 220 can be chained together when there
is a sequence of storage images 220 Ci where a base storage image 220 for each Ci is a
top storage image 220 for a next Ci+1.
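
The chaining condition can be stated as a small check. The Python sketch below is an illustration under assumed names: `Incremental` and `is_chain` are hypothetical, and each image is identified simply by the labels of its top and base.

```python
from typing import NamedTuple


class Incremental(NamedTuple):
    """An incremental storage image, identified by its top and base (hypothetical type)."""
    top: str
    base: str


def is_chain(images):
    """True if the base of each image is the top of the next image in the sequence."""
    return all(cur.base == nxt.top for cur, nxt in zip(images, images[1:]))


chained = [Incremental("June7", "June5"), Incremental("June5", "June3")]
unchained = [Incremental("June9", "June3"), Incremental("June5", "June3")]
```

Here `is_chain(chained)` holds because the first image's base, June5, is the second image's top; `is_chain(unchained)` does not, since both images share a base rather than following one another.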

Examples of Incremental Images

For a first example, the system 100 can make a snapshot 210 each day, and
form a level-0 storage image 220 in response to the logical sum of daily snapshots 210.

June3.level0 = June3 + June2 + June1

(June3, June2, and June1 are snapshots 210 taken on those respective dates.)

The June3.level0 storage image 220 includes all storage blocks 115 in the
daily snapshots 210 June3, June2, and June1. Accordingly, the June3.level0 storage
image 220 includes all storage blocks 115 in a consistent file system 114 (as well as
possibly other storage blocks 115 that are unnecessary for the consistent file system 114
active at the time of the June3 snapshot 210).

In the first example, the system 100 can form an (incremental) level-1
storage image 220 in response to the logical sum of daily snapshots 210 and the logical
difference with a single snapshot 210.

June5.level1 = June5 + June4 - June3

(June5, June4, and June3 are snapshots 210 taken on those respective dates.)

It is not required to subtract the June2 and June1 snapshots 210 when
forming the June5.level1 storage image 220. Any storage blocks 115 that the June5
snapshot 210 and the June4 snapshot 210 have in common with either the June2 snapshot
210 or the June1 snapshot 210, they will necessarily have in common with the June3
snapshot 210. This is because any storage block 115 that was part of the file system 114
on June2 or June1, and is still part of the file system 114 on June5 or June4, must have
also been part of the file system 114 on June3.
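
The argument can be checked on a toy volume. The Python sketch below is a worked example under an assumption taken from the reasoning above: each storage block belongs to the file system for one continuous span of days and is not reused within the period shown. The block numbers and lifetimes are invented for illustration.

```python
# block number -> (first day in the file system, first day no longer in it)
lifetimes = {
    10: (1, 9),   # present for the whole period
    11: (1, 3),   # freed before June 3
    12: (2, 5),   # freed before June 5
    13: (4, 9),   # allocated on June 4
    14: (5, 9),   # allocated on June 5
    15: (3, 4),   # present only on June 3
}


def snapshot(day):
    """Blocks that are part of the file system on the given day of June."""
    return {b for b, (born, freed) in lifetimes.items() if born <= day < freed}


june = {day: snapshot(day) for day in range(1, 6)}

# June5.level1 = June5 + June4 - June3
level1 = (june[5] | june[4]) - june[3]

# Also subtracting June2 and June1 removes nothing further.
level1_long = (june[5] | june[4]) - june[3] - june[2] - june[1]
```

Both expressions select blocks 13 and 14 only: block 10 is removed by June3 alone, and blocks that left the file system before June 3 cannot appear in the June4 or June5 snapshots.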

In the first example, the system 100 can form an (incremental) level-2
storage image 220 in response to the logical sum of daily snapshots 210 and the logical
difference with a single snapshot 210 from the time of the level-1 base storage image
220.

June7.level2 = June7 + June6 - June5

(June7, June6, and June5 are snapshots 210 taken on those respective dates.)

In the first example, the storage images 220 June3.level0, June5.level1, and
June7.level2 collectively include all storage blocks 115 needed to construct a full set F of
storage blocks 115 in a consistent file system 114.

For a second example, the system 100 can form a different (incremental)
level-1 storage image 220 in response to the logical sum of daily snapshots 210 and the
logical difference with a single snapshot 210 from the time of the level-0 storage image
220.

June9.level1 = June9 + June8 - June3

(June9, June8, and June3 are snapshots 210 taken on those respective dates.)

Similar to the first example, the storage images 220 June3.level0 and
June9.level1 collectively include all storage blocks 115 needed to construct a full set F of
storage blocks 115 in a consistent file system 114. There is no particular requirement
that the June9.level1 storage image 220 be related to or used in conjunction with the
June7.level2 storage image 220 in any way.
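
Both examples can be verified on a toy volume with Python sets. The sketch is illustrative: the block lifetimes are invented, and it assumes each block belongs to the file system for one continuous span of days.

```python
# block number -> (first day in the file system, first day no longer in it)
lifetimes = {
    10: (1, 10), 11: (1, 3), 12: (2, 5), 13: (4, 10), 14: (5, 7),
    15: (6, 10), 16: (7, 10), 17: (8, 10), 18: (9, 10),
}


def snapshot(day):
    """Blocks that are part of the file system on the given day of June."""
    return {b for b, (born, freed) in lifetimes.items() if born <= day < freed}


june = {day: snapshot(day) for day in range(1, 10)}

level0 = june[3] | june[2] | june[1]              # June3.level0
level1 = (june[5] | june[4]) - june[3]            # June5.level1
level2 = (june[7] | june[6]) - june[5]            # June7.level2
level1_alt = (june[9] | june[8]) - june[3]        # June9.level1

first_example = level0 | level1 | level2          # enough to rebuild June 7
second_example = level0 | level1_alt              # enough to rebuild June 9
```

In this model `first_example` contains every block of the June7 snapshot and `second_example` contains every block of the June9 snapshot, each with a few extra blocks that the restored file system no longer needs.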

File System Image Transfer Techniques

To perform one of these copying operations, the file server 110 includes
operating system or application software for controlling the processor 111, and data paths
for transferring data from the mass storage 113 to the communication path 130 to the
destination file system 120. However, the selected storage blocks 115 in the image
stream 230 are copied from the file system 114 to the corresponding destination file
system 120 without logical file system processing by the file system 114 on the first file
server 110.

In a preferred embodiment, the system 100 is disposed to perform one of at
least four such copying operations:

o Volume Copying. The system 100 can be disposed to create an image stream 230
for copying the file system 114 to the destination file system 120.

The image stream 230 comprises a sequence of storage blocks 115 from a
storage image 220. As in nearly all the image transfer techniques described herein, that
storage image 220 can represent a full image or an incremental image:

o Full image: The storage blocks 115 and the storage image 220 represent a
complete and consistent file system 114.

o Incremental image: The storage blocks 115 and the storage image 220 represent
an incremental set of changes to a consistent file system 114, which when
combined with that file system 114 form a new consistent file system 114.

The image stream 230 can be copied from the file server 110 to the
destination file system 120 using any communication technique. This could include a
direct communication link, a LAN (local area network), a WAN (wide area network),
transfer via tape, or a combination thereof. When the image stream 230 is transferred
using a network, the storage blocks 115 are encapsulated in messages using a network
communication protocol known to the file server 110 and to the destination file system
120. In some network communication protocols, there can be additional messages
between the file server 110 and the destination file system 120 to ensure the receipt of
a complete and correct copy of the image stream 230.
The destination file system 120 receives the image stream 230 and
identifies the storage blocks 115 from the mass storage 113 to be recorded on the
destination file system 120.

When the storage blocks 115 represent a complete and consistent file
system 114, the destination file system 120 records that file system 114 without logical
change. The destination file system 120 can make that file system 114 available for read-
only access by local processes. In alternative embodiments, the destination file system
120 may make that file system 114 available for access by local processes, without
making changes by those local processes available to the file server 110 that was the
source of the file system 114.

When the storage blocks 115 represent an incremental set of changes to a
consistent file system 114, the destination file system 120 combines those changes with
that file system 114 to form a new consistent file system 114. The destination file system
120 can make that new file system 114 available for read-only access by local processes.
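
The full-then-incremental sequence can be sketched in Python with dictionaries standing in for volumes. This is a simplified model, not the file server's implementation: the function names are hypothetical, and it assumes changed data is written to a fresh block (block 3) rather than over the old one (block 1).

```python
def make_stream(volume, image_blocks):
    """Order the selected blocks and tag each with its block location."""
    return [(vbn, volume[vbn]) for vbn in sorted(image_blocks)]


def apply_stream(destination, stream):
    """Record each received block at its tagged location on the destination."""
    for vbn, data in stream:
        destination[vbn] = data


source = {0: b"A", 1: b"B-old", 2: b"C", 3: b"B-new"}   # block number -> contents
snap_a = {0, 1, 2}    # blocks in the earlier snapshot
snap_b = {0, 2, 3}    # blocks in the later snapshot

destination = {}
apply_stream(destination, make_stream(source, snap_a))            # full image
apply_stream(destination, make_stream(source, snap_b - snap_a))   # incremental image

# Every block of the later snapshot is now present on the destination.
restored = {vbn: destination[vbn] for vbn in snap_b}
```

Only block 3 travels in the incremental image; blocks 0 and 2 are already on the destination from the full image, and the stale block 1 is simply no longer referenced.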

In embodiments where the destination file system 120 makes the
transferred file system 114 available for access by local processes, changes to the file
system 114 at the destination file system 120 can be flushed when a subsequent
incremental set of changes is received by the destination file system 120.

All aspects of the file system 114 are included in the image stream 230,
including file data, file structure hierarchy, and file attributes. File attributes preferably
include NFS attributes, CIFS attributes, and those snapshots 210 already maintained in
the file system 114.

Disk Copying. In a first preferred embodiment of volume copying (herein
called "disk copying"), the destination file system 120 can include a disk drive or other
similar accessible storage device. The system 100 can copy the storage blocks 115 from
the mass storage 113 to that accessible storage device, providing a copy of the file system
114 that can be inspected at the current time.


When performing disk copying, the system 100 creates an image stream
230, and copies the selected storage blocks 115 from the mass storage 113 at the file
server 110 to corresponding locations on the destination file system 120. Because the
mass storage 113 at the file server 110 and the destination file system 120 are both disk
drives, copying to corresponding locations should be simple and effective.

It is possible that locations of storage blocks 115 at the mass storage 113 at
the file server 110 and at the destination file system 120 do not readily coincide, such as
if the mass storage 113 and the destination file system 120 have different sizes or
formatting. In those cases, the destination file system 120 can reorder the storage blocks
115 in the image stream 230, similar to the "Tape Backup" embodiment described
herein.

Tape Backup. In a second preferred embodiment of volume copying 
(herein called 'tape backup'0» the destination file system 120 can include a tape device or 
other similar long-term storage device. The system 100 can copy storage blocks 115 
fix>m the mass storage 113 to that long-term storage device, providing a backup copy of 
the file system 114 that can be restored at a later time. 

When performing tape backup, the system 100 creates an image stream
230, and copies the selected storage blocks 115 from the mass storage 113 at the file
server 110 to a sequence of new locations on the destination file system 120. Because
the destination file system 120 includes one or more tape drives, the system 100 creates
and transmits a table indicating which locations on the mass storage 113 correspond to
which other locations on the destination file system 120.
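
The location table described above can be pictured with a short sketch. It is only an illustration under assumptions not found in this specification: storage blocks are identified by integer block numbers, the tape is modeled as a Python list, and the names write_to_tape and restore_from_tape are hypothetical.

```python
def write_to_tape(selected_blocks):
    """Append blocks to a sequential "tape" and build the location table.

    selected_blocks: dict mapping source block number -> block data (bytes).
    Returns (tape, table); table maps source block number -> tape position.
    """
    tape = []
    table = {}
    for block_number in sorted(selected_blocks):
        table[block_number] = len(tape)  # next sequential position on tape
        tape.append(selected_blocks[block_number])
    return tape, table


def restore_from_tape(tape, table):
    """Rebuild the block number -> data mapping from the tape and its table."""
    return {block_number: tape[pos] for block_number, pos in table.items()}


# Hypothetical example data: three selected blocks with sparse block numbers.
source = {7: b"g", 2: b"b", 40: b"x"}
tape, table = write_to_tape(source)
restored = restore_from_tape(tape, table)
```

In this sketch the table alone suffices to place each block back at its original block number, whatever order the blocks were written to tape.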

Similar to transfer of an image stream 230 using a network communication 
protocol, the destination file system 120 can add additional information to the image 
stream 230 for recording on tape. This additional information can include tape headers 
and tape gaps, blocking or clustering of storage blocks 115 for recording on tape, and 
reformatting of storage blocks 115 for recording on tape.

File Backup. In a third preferred embodiment of volume copying (herein
called "file backup"), the image stream 230 can be copied to a new file within a file
system 114, either at the file server 110 or at a file system 114 on the destination file
system 120.

Similar to tape backup, the destination file system 120 can add additional
information to the image stream 230 for recording in a file. This additional information
can include file metadata useful for the file system 114 to locate storage blocks 115
within the file.

o Volume Mirroring. The system 100 can be disposed to create image streams 230
for copying the file system 114 to the destination file system 120 coupled to a
second file server 110 on a frequent basis, thus providing a mirror copy of the file
system 114.

In a preferred embodiment, the mirror copy of the file system 114 can be
used for takeover by a second file server 110 from the first file server 110, such as for
example if the first file server 110 fails.

When performing volume mirroring, the system 100 first transfers an
image stream 230 representing a complete file system 114 from the file server 110 to the
destination file system 120. The system 100 then periodically transfers image streams
230 representing incremental changes to that file system 114 from the file server 110 to
the destination file system 120. The destination file system 120 is able to reconstruct a
most recent form of the consistent file system 114 from the initial full image stream 230
and the sequence of incremental image streams 230.

It is possible to perform volume mirroring using volume copying of a full
storage image 230 and a sequence of incremental storage images 230. However,
determining the storage blocks 115 to be included in an incremental storage image 230
can take substantial time for a relatively large file system 114, if done by logical
subtraction.

As used herein, a "mark-on-allocate storage image" is a subset of a
snapshot, the member storage blocks being those that have been added to a snapshot that
originally formed a consistent file system.

In a preferred embodiment, rather than using logical subtraction, as
described above, at the time the incremental storage image 230 is about to be
transferred, the file server 110 maintains a separate "mark-on-allocate" storage image
230. The mark-on-allocate storage image 230 is constructed by setting a bit for each
storage block 115, as it is added to the consistent file system 114. The mark-on-allocate
storage image 230 does not need to be stored on the mass storage 113, included in the
block map, or otherwise backed-up; it can be reconstructed from other storage images
230 already at the file server 110.

When an incremental storage image 230 is transferred, a first mark-on-allocate
storage image 230 is used to determine which storage blocks 115 to include in
the storage image 230 for transfer. A second mark-on-allocate storage image 230 is used
to record changes to the file system 114 while the transfer is performed. After the
transfer is performed, the first and second mark-on-allocate storage images 230 exchange
roles.
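
The exchange of roles between the two mark-on-allocate storage images can be sketched as follows. This is a minimal illustration, assuming each bitmap is modeled as a Python set of block numbers and that the exchange happens at the moment a transfer begins; the class and method names are hypothetical and do not appear in this specification.

```python
class MarkOnAllocate:
    """Two bitmaps, modeled as sets of block numbers, that exchange roles."""

    def __init__(self):
        self.active = set()  # records blocks as they are allocated
        self.frozen = set()  # selects blocks for the transfer in progress

    def on_allocate(self, block_number):
        # "Set a bit" for each storage block as it is added to the file system.
        self.active.add(block_number)

    def begin_transfer(self):
        # Exchange roles: the bitmap that was recording now selects blocks for
        # transfer, and the other (emptied) bitmap records new allocations.
        self.frozen.clear()
        self.active, self.frozen = self.frozen, self.active
        return sorted(self.frozen)


marks = MarkOnAllocate()
marks.on_allocate(3)
marks.on_allocate(5)
first_transfer = marks.begin_transfer()   # blocks allocated before transfer 1
marks.on_allocate(9)                      # allocated while transfer 1 runs
second_transfer = marks.begin_transfer()  # only the blocks added since then
```

Because allocations are marked as they occur, no logical subtraction over the whole file system is needed to find the blocks for each increment.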

Full Mirroring. In a first preferred embodiment of volume mirroring
(herein called "full mirroring"), the destination file system 120 includes a disk drive or
other similar accessible storage device.

Upon the initial transfer of the full storage image 230 from the file server
110, the destination file system 120 creates a copy of the consistent file system 114.
Upon the sequential transfer of each incremental storage image 230 from the file server
110, the destination file system 120 updates its copy of the consistent file system 114.
The destination file system 120 thus maintains its copy of the file system 114 nearly up
to date, and can be inspected at any time.

When performing full mirroring, similar to disk copying, the system 100
creates an image stream 230, and copies the selected storage blocks 115 from the mass
storage 113 at the file server 110 to corresponding locations on the destination file
system 120.

Incremental Mirroring. In a second preferred embodiment of volume
mirroring (herein called "incremental mirroring"), the destination file system 120 can 
include both (1) a tape device or other relatively slow storage device, and (2) a disk drive 
or other relatively fast storage device. 

As used herein, an "incremental mirror" of a first file system is a base 
storage image from the first file system, and at least one incremental storage image from
the first file system, on two storage media of substantially different types. Thus, a 
complete copy of the first file system can be reconstructed from the two or more objects. 

Upon the initial transfer of the full storage image 230 from the file server
110, the destination file system 120 copies a complete set of storage blocks 115 from the
mass storage 113 to that relatively slow storage device. Upon the sequential transfer of
each incremental storage image 230 from the file server 110, the destination file system
120 copies incremental sets of storage blocks 115 from the mass storage 113 to the
relatively fast storage device. Thus, the full set of storage blocks 115 plus the
incremental sets of storage blocks 115 collectively represent an up-to-date file system
114 but do not require an entire duplicate disk drive.



When performing incremental mirroring, for the base storage image 230,
the system 100 creates an image stream 230, and copies the selected storage blocks 115
from the mass storage 113 at the file server 110 to a set of new locations on the relatively
slow storage device. The system 100 writes the image stream 230, including storage
block location information, to the destination file system 120. In a preferred
embodiment, the system 100 uses a tape as an intermediate destination storage medium,
so that the base storage image 230 can be stored for a substantial period of time without
having to occupy disk space.

For each incremental storage image 230, the system 100 creates a new
image stream 230, and copies the selected storage blocks 115 from the mass storage 113
at the file server 110 to a set of new locations on the accessible storage device.
Incremental storage images 230 are created continuously and automatically at periodic
times that are relatively close together.

The incremental storage images 230 are received at the destination file 
system 120, which unpacks them and records the copied storage blocks 115 in an
incremental mirror data structure. As each new incremental storage image 230 is copied, 
copied storage blocks 115 overwrite the equivalent storage blocks 115 from earlier 
incremental storage images 230. In a preferred embodiment, the incremental mirror data 
structure includes a sparse file structure including only those storage blocks 115 that are 
different from the base storage image 230. 

In a preferred embodiment, the incremental storage images 230 are
transmitted to the destination file system 120 with a data structure indicating a set of
storage blocks 115 that were deallocated (that is, removed) from the file system on the
file server 110. In response to this data structure, the destination file system 120 removes
those indicated storage blocks 115 from its incremental mirror data structure. This
allows the destination file system 120 to maintain the incremental mirror data structure at
a size no larger than approximately the actual differences between a current file system at
the file server 110 and the base storage image 230 from the file server 110.
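
The behavior of the incremental mirror data structure can be sketched as a sparse overlay. This is an illustration only, assuming the overlay is a Python dict keyed by integer block number and that deallocations are applied after the changed blocks of the same increment; the names IncrementalMirror and apply_increment are hypothetical.

```python
class IncrementalMirror:
    """Sparse overlay holding only blocks that differ from the base image."""

    def __init__(self):
        self.overlay = {}  # block number -> most recently copied data

    def apply_increment(self, changed_blocks, deallocated=()):
        # Newer copies overwrite the equivalent blocks from earlier increments.
        self.overlay.update(changed_blocks)
        # Blocks reported as deallocated at the file server are dropped, so
        # the overlay stays close to the real difference from the base image.
        for block_number in deallocated:
            self.overlay.pop(block_number, None)


mirror = IncrementalMirror()
mirror.apply_increment({4: b"v1", 9: b"v1"})
mirror.apply_increment({4: b"v2", 12: b"v1"}, deallocated=[9])
```

After the second increment, the overlay holds the newer copy of block 4 and the new block 12, while block 9 has been removed in response to the deallocation list.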

Consistency Points. When performing either full mirroring or incremental
mirroring, it can occur that the transfer of a storage image 230 takes longer than the time
needed for the file server 110 to update its consistent file system 114 from a first
consistency point to a second consistency point. Consistency points are described in
further detail in the WAFL Disclosures.

In a preferred embodiment, the file server 110 does not attempt to create a
storage image 230 and to transfer storage blocks 115 for every consistency point.
Instead, after a transfer of a storage image 230, the file server 110 determines the most
recent consistency point (or alternatively, determines the next consistency point) as the
effective next consistency point. The file server 110 uses the effective next consistency
point to determine any incremental storage image 230 for a next transfer.

o Volume Replication. The destination file system 120 can include a disk drive or
other accessible storage device. The system 100 can copy storage blocks from the
mass storage 113 to that accessible storage device at a signal from the destination
file system 120, to provide replicated copies of the file system 114 for updated
(read-only) use by other file servers 110.

The file server 110 maintains a set of selected master snapshots 210. A
master snapshot 210 is a snapshot 210 whose existence can be known by the destination
file system 120, so that the destination file system 120 can be updated with reference to
the file system 114 maintained at the file server 110. In a preferred embodiment, each
master snapshot 210 is designated by an operator command at the file server 110, and is
retained for a relatively long time, such as several months or a year.

In a preferred embodiment, at a minimum, each master snapshot 210 is
retained until all known destination file systems 120 have been updated past that master
snapshot 210. A master snapshot 210 can be designated as a shadow snapshot 210, but in
such cases destination file systems 120 are taken off-line during update of the master
shadow snapshot 210. That is, destination file systems 120 wait for completion of the
update of that master shadow snapshot 210 before they are allowed to request an update
from that master shadow snapshot 210.

The destination file system 120 generates a message (such as upon
command of an operator or in response to initialization or self-test) that it transmits to the
file server 110, requesting an update of the file system 114. The message includes a
newest master snapshot 210 to which the destination file system 120 has most recently
synchronized. The message can also indicate that there is no such newest master
snapshot 210.

The file server 110 determines any incremental changes that have occurred
to the file system 114 from the newest master snapshot 210 at the destination file system
120 to the newest master snapshot 210 at the file server 110. In response to this
determination, the file server 110 determines a storage image 230 including storage
blocks 115 for transfer to the destination file system 120, so as to update the copy of the
file system 114 at the destination file system 120.

If there is no such newest master snapshot 210, the system 100 performs
volume copying for a full copy of the file system 114 represented by the newest master
snapshot 210 at the file server 110. Similarly, if the oldest master snapshot 210 at the file
server 110 is newer than the newest master snapshot 210 at the destination file system
120, the system 100 performs volume copying for a full copy of the file system 114.

After volume replication, the destination file system 120 updates its most
recent master snapshot 210 to be the most recent master snapshot 210 from the file server
110.

Volume replication is well suited to uploading upgrades to a publicly
accessible database, document, or web site. Those destination file systems 120, such as
mirror sites, can then obtain the uploaded upgrades periodically, when they are
initialized, or upon operator command at the destination file system 120. If the
destination file systems 120 are not in communication with the file server 110 for a
substantial period of time, when communication is re-established, the destination file
systems 120 can perform volume replication with the file server 110 to obtain a
substantially up-to-date copy of the file system 114.

In a first preferred embodiment of volume replication (herein called
"simple replication"), the destination file system 120 communicates directly (using a
direct communication link, a LAN, a WAN, or a combination thereof) with the file server
110.

In a second preferred embodiment of volume replication (herein called
"multiple replication"), a first destination file system 120 communicates directly (using a
direct communication link, a LAN, a WAN, or a combination thereof) with a second
destination file system 120. The second destination file system 120 acts like the file
server 110 to perform simple replication for the first destination file system 120.

A sequence of such destination file systems 120 ultimately terminates in a
destination file system 120 that communicates directly with the file server 110 and
performs simple replication. The sequence of destination file systems 120 thus forms a
replication hierarchy, such as in a directed graph or a tree of file servers 110.

In alternative embodiments, the system 100 can also perform one or more 
combinations of these techniques. 

In a preferred embodiment, the file server 110 can maintain a set of
pointers to snapshots 210, naming those snapshots 210 and having the property that
references to the pointers are functionally equivalent to references to the snapshots 210
themselves. For example, one of the pointers can have a name such as "master," so that
the newest master snapshot 210 at the file server 110 can be changed simultaneously for
all destination file systems 120. Thus, all destination file systems 120 can synchronize to
the same master snapshot 210.

Shadow Snapshots 

The system 100 includes the possibility of designating selected snapshots
210 as "shadow" snapshots 210.

As used herein, a "shadow snapshot" is a subset of a snapshot, the member 
storage blocks no longer forming a consistent file system. Thus, at one time the member 
storage blocks of the snapshot did form a consistent file system, but at least some of the
member storage blocks have been removed from that snapshot.

A shadow snapshot 210 has the property that the file server 110 can reuse
the storage blocks 115 in the snapshot 210 whenever needed. A shadow snapshot 210
can be used as the base of an incremental storage image 230. In such cases, storage
blocks 115 might have been removed from the shadow snapshot 210 due to reuse by the
file server 110. It thus might occur that the incremental storage image 230 resulting
from logical subtraction using the shadow snapshot 210 includes storage blocks 115
that are not strictly necessary (having been removed from the shadow snapshot 210 they
are not subtracted out). However, all storage blocks 115 necessary for the incremental
storage image 230 will still be included.

For regular snapshots 210, the file server 110 does not reuse the storage
blocks 115 in the snapshot 210 until the snapshot 210 is released. Even if the storage
blocks 115 in the snapshot 210 are no longer part of the active file system, the file server
110 retains them without change. Until released, each regular snapshot 210 preserves a
consistent file system 114 that can be accessed at a later time.

However, for shadow snapshots 210, the file server 110 can reuse the
storage blocks 115 in the shadow snapshot 210. When one of those storage blocks 115 is
reused, the file server 110 clears the bit in the shadow snapshot 210 for that storage block
115. Thus, each shadow snapshot 210 represents a set of storage blocks 115 from a
consistent file system 114 that have not been changed in the active file system 114 since
the shadow snapshot 210 was made. Because storage blocks 115 can be reused, the
shadow snapshot 210 does not retain the property of representing a consistent file system
114. However, because the file server 110 can reuse those storage blocks 115, the
shadow snapshot 210 does not cause any storage blocks 115 on the mass storage 113 to
be permanently occupied.
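
The effect of clearing bits in a shadow snapshot can be illustrated with sets. This sketch assumes snapshots are modeled as Python sets of integer block numbers; the class ShadowSnapshot, the function logical_subtraction, and the example block numbers are hypothetical. It shows only the set property relied upon above: because a shadow snapshot can lose members but never gain them, subtracting it yields a superset of what subtracting the original snapshot would yield.

```python
class ShadowSnapshot:
    """A snapshot whose member blocks may be reused by the file server."""

    def __init__(self, blocks):
        self.blocks = set(blocks)  # one "bit" per member storage block

    def on_reuse(self, block_number):
        # When the file server reuses a block, its bit is cleared.
        self.blocks.discard(block_number)


def logical_subtraction(image, base):
    """Blocks that are in the image but not in the base."""
    return image - base


original = {1, 2, 3, 4}          # members when the snapshot was made
shadow = ShadowSnapshot(original)
shadow.on_reuse(2)               # block 2 is reused, so its bit is cleared

newer_image = {1, 2, 3, 5}
against_original = logical_subtraction(newer_image, original)
against_shadow = logical_subtraction(newer_image, shadow.blocks)
```

Every block selected against the original snapshot is also selected against the shadow snapshot, so no necessary block is omitted; the shadow result can only be larger.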

Method of Operation 

Figure 3 shows a process flow diagram of a method for file system image
transfer.

A method 300 is performed by the file server 110 and the
destination file system 120, and includes a set of flow points and process steps as
described herein.

/ / / 

Generality of Operational Technique 

In each of the file system image transfer techniques, the method 300 
performs three operations: 

o Select a storage image 220, in response to a first file system (or a snapshot
thereof) to have an operation performed thereon.

o Form an image stream 230 in response to the storage image 220. Perform an
operation on the image stream 230, such as backup or restore within the first file
system, or copying or transfer to a second file system.

o Reconstruct the first file system (or the snapshot thereof) in response to the image
stream 230.

As shown herein, each of these steps is quite general in its application.

In the first (selection) step, the storage image 220 selected can be a
complete file system or can be a subset thereof. The subset can be an increment to the 
complete file system, such as those storage blocks that have been changed, or can be 
another type of subset. The storage image 220 can be selected a single time, such as for a
backup operation, or repeatedly, such as for a mirroring operation. The storage image 
220 can be selected in response to a process at a sending file server or at a receiving file 
server. 

For example, as shown herein, the storage image 220 selected can be for a
full backup or copying of an entire file system, or can be for incremental backup or
incremental mirroring of a file system. The storage image 220 selected can be
determined by a sending file server, or can be determined in response to a request by a
receiving file server (or set of receiving file servers).

In the second (operational) step, the image stream 230 can be selected so as 
to optimize the operation. The image stream 230 can be selected and ordered to optimize 
transfer to different types of media, to optimize transfer rate, or to optimize reliability. In
a preferred embodiment, the image stream 230 is optimized to maximize transfer rate 
from parallel disks in a RAID disk system. 

In the third (reconstruction) step, the image stream 230 can be 
reconstructed into a complete file system, or can be reconstructed into an increment of a 
file system. The reconstruction step can be performed immediately or after a delay, can
be performed in response to the process that initiated the selection step, or can be
performed independently in response to other needs.

/ / / 

Selecting A Storage Image 

In each of the file system image transfer techniques, the method 300 selects 
a storage image 220 to be transferred. 

At a flow point 370, the file server 110 is ready to select a storage image
220 for transfer.

At a step 371, the file server 110 forms a logical sum LS of a set of storage
images 220 A1 + A2, thus LS = A1 + A2. The logical sum LS can also include any
plurality of storage images 220, such as A1 + A2 + A3 + A4, thus for example LS = A1
+ A2 + A3 + A4.

At a step 372, the file server 110 determines if the transfer is a full transfer
or an incremental transfer. If the transfer is incremental, the method 300 continues with
the next step. If the transfer is a full transfer, the method 300 continues with the flow
point 380.

At a step 373, the file server 110 forms a logical difference LD of the
logical sum LS and a base storage image 220 B, thus LD = LS - B. The base storage
image 220 B comprises a snapshot 210.

At a flow point 380, the file server 110 has selected a storage image 230 for
transfer.
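
The selection steps 371 through 373 can be sketched with set operations. This is a minimal illustration, assuming each storage image is modeled as a Python set of block numbers, the logical sum as set union, and the logical difference as set difference; the function name select_storage_image is hypothetical.

```python
def select_storage_image(images, base=None):
    """Return the set of block numbers selected for transfer.

    images: iterable of sets (A1, A2, ...) whose logical sum LS is formed.
    base:   optional base storage image B. When given, the incremental
            selection LD = LS - B is returned; otherwise the full LS.
    """
    logical_sum = set().union(*images)   # step 371: LS = A1 + A2 + ...
    if base is None:                     # step 372: full transfer
        return logical_sum
    return logical_sum - base            # step 373: LD = LS - B


a1 = {1, 2, 3}
a2 = {3, 4}
b = {1, 2}
full_selection = select_storage_image([a1, a2])
incremental_selection = select_storage_image([a1, a2], base=b)
```

With these example sets, a full transfer selects every block in either image, while an incremental transfer omits the blocks already present in the base image.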

Volume Copying 

At a flow point 310, the file server 110 is ready to perform a volume
copying operation.

At a step 311, the file server 110 selects a storage image 220 for transfer, as
described with regard to the flow point 370 through the flow point 380. If the volume
copying operation is a full volume copy, the storage image 220 selected is for a full
transfer. If the volume copying operation is an incremental volume copy, the storage
image 220 selected is for an incremental transfer.

At a step 312, the file server 110 determines if the volume is to be copied
to disk or to tape.

o If the volume is to be copied to disk, the method 300 continues with the step 313.

o If the volume is to be copied to tape, the method 300 continues with the step 314.

At a step 313, the file server 110 creates an image stream 230 for the
selected storage image 220. In a preferred embodiment, the storage blocks 115 in the
image stream 230 are ordered for transfer to disk. Each storage block 115 is associated
with a VBN (virtual block number) for identification. The method 300 continues with
the step 315.

At a step 314, the file server 110 performs the same functions as in the step
313, except that the storage blocks 115 in the image stream 230 are ordered for transfer
to tape.

At a step 315, the file server 110 copies the image stream 230 to the
destination file system 120 (disk or tape).

o If the image stream 230 is copied to disk, the file server 110 preferably places
each storage block 115 in an equivalent position on the target disk(s) as it was on
the source disk(s), similar to what would happen on retrieval from tape.

In a preferred embodiment, the file server 110 copies the image stream 230
to the destination file system 120 using a communication protocol known to both the file
server 110 and the destination file system 120, such as TCP. As noted herein, the image
stream 230 used with the communication protocol is similar to the image stream 230
used for tape backup, but can include additional messages or packets for
acknowledgement or retransmission of data.

The destination file system 120 presents the image stream 230 directly to a
restore element, which copies the image stream 230 onto the destination file system 120
target disk(s) as they were on the source disk(s). Because a consistent file system 114 is
copied from the file server 110 to the destination file system 120, the storage blocks 115
in the image stream 230 can be used directly as a consistent file system 114 when they
arrive at the destination file system 120.

The destination file system 120 might have to alter some inter-block
pointers, responsive to the VBN of each storage block 115, if some or all of the target
storage blocks 115 are recorded in different physical locations on disk from the source
storage blocks 115.

o If the image stream 230 is copied to tape, the file server 110 preferably places
each storage block 115 in a position on the target tape so that it can be retrieved
by its VBN. When the storage blocks 115 are eventually retrieved from tape into
a disk file server 110, they are preferably placed in equivalent positions on the
target disk(s) as they were on the source disk(s).

The destination file system 120 records the image stream 230 directly onto
tape, along with a set of block number information for each storage block 115. The
destination file system 120 can later retrieve selected storage blocks 115 from tape and
place them onto a disk file server 110. Because a consistent file system 114 is copied
from the file server 110 to the destination file system 120, the storage blocks 115 in the
image stream 230 can be restored directly to disk when later retrieved from tape at the
destination file system 120.

The destination file system 120 might have to alter some inter-block
pointers, responsive to the VBN of each storage block 115, if some or all of the target
storage blocks 115 are retrieved from tape and recorded in different physical locations on
disk from the source storage blocks 115. The destination file system 120 records this
information in header data that it writes onto tape.

At a flow point 320, the file server 110 has completed the volume copying
operation.

Volume Mirroring

At a flow point 330, the file server 110 is ready to perform a volume
mirroring operation.

At a step 331, the file server 110 performs a full volume copying operation,
as described with regard to the flow point 310 through the flow point 320. The volume
copying operation is performed for a full copy of the file system 114.

o If the function to be performed is full mirroring, the file server 110 performs the
full volume copying operation to disk as the target destination file system 120.

o If the function to be performed is incremental mirroring, the file server 110
performs the full volume copying operation to tape as the target destination file
system 120.

At a step 332, the file server 110 sets a mirroring timer for incremental
update for the volume mirroring operation.

At a step 333, the mirroring timer is hit, and the file server 110 begins the
incremental update for the volume mirroring operation.

At a step 334, the file server 110 performs an incremental volume copying
operation, as described with regard to the flow point 310 through the flow point 320.
The volume copying operation is performed for an incremental upgrade of the file system
114.

The incremental volume copying operation is performed with disk as the
target destination file system 120.

o If the initial full volume copying operation was performed to disk, the destination
file system 120 increments its copy of the file system 114 to include the
incremental storage image 220.

o If the initial full volume copying operation was performed to tape, the destination 
file system 120 records the incremental storage image 220 and integrates it into an
incremental mirror data structure, as described above, for possibly later 
incrementing its copy of the file system 114. 

At a step 335, the file server 110 copies the image stream 230 to the target
destination file system 120. The method 300 returns to the step 332, at which step the
file server 110 resets the mirroring timer, and the method 300 continues.

When the destination file system 120 receives the image stream 230, it
records the storage blocks 115 in that image stream 230 similar to the process of volume
copying, as described with regard to the step 315.

If the method 300 is halted (by an operator command or otherwise), the
method 300 completes at the flow point 340.

At a flow point 340, the file server 110 has completed the volume
mirroring operation.

Reintegration of Incremental Mirror

At a flow point 370, the file server 110 is ready to restore a file system
from the base storage image 220 and the incremental mirror data structure.

At a step 371, the file server 110 reads the base storage image 220 into its
file system.

At a step 372, the file server 110 reads the incremental mirror data structure
into its file system and uses that data structure to update the base storage image 220.

At a step 373, the file server 110 remounts the file system that was updated
using the incremental mirror data structure.

At a flow point 380, the file server 110 is ready to continue operations with
the file system restored from the base storage image 220 and the incremental mirror data
structure.
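
The reintegration steps can be sketched as a merge of two mappings. This is an illustration under assumptions not stated in this specification: both the base storage image and the incremental mirror data structure are modeled as Python dicts from block number to block data, and the function name reintegrate is hypothetical. Remounting is outside the scope of the sketch.

```python
def reintegrate(base_image, incremental_mirror):
    """Rebuild a file system image from a base image plus changed blocks.

    Both arguments map block number -> block data. Blocks present in the
    incremental mirror replace the equivalent blocks of the base image.
    """
    restored = dict(base_image)          # read the base storage image
    restored.update(incremental_mirror)  # update it from the mirror structure
    return restored                      # the caller would then remount this


base_image = {1: b"a", 2: b"b"}
incremental_mirror = {2: b"B", 3: b"c"}
restored = reintegrate(base_image, incremental_mirror)
```

The sketch copies the base image rather than modifying it, so the base remains available if the reintegration must be repeated.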

Volume Replication

At a flow point 350, the file server 110 is ready to perform a volume 
replication operation. 

At a step 351, the destination file system 120 initiates the volume 
replication operation. The destination file system 120 sends an indicator of its newest 
master snapshot 210 to the file server 110, and requests the file server 110 to perform the
volume replication operation. 

At a step 352, the file server 110 determines if it needs to perform a volume 
replication operation to synchronize with a second file server 110. In this case, the 
second file server 110 takes the role of the destination file system 120, and initiates the 
volume replication operation with regard to the first file server 110.

At a step 353, the file server 110 determines its newest master snapshot 
210, and its master snapshot 210 corresponding to the master snapshot 210 indicated by
the destination file system 120. 

o If the file server 110 has at least one master snapshot 210 older than the master
snapshot 210 indicated by the destination file system 120, it selects the
corresponding master snapshot 210 as the newest one of those.

In this case, the method proceeds with the step 354.

o If the file server 110 does not have at least one master snapshot 210 older than the
master snapshot 210 indicated by the destination file system 120 (or if the
destination file system 120 did not indicate any master snapshot 210), it does not
select any master snapshot 210 as a corresponding master snapshot.

In this case, the method proceeds with the step 355. 

At a step 354, the file server 110 performs an incremental volume copying
operation, responsive to the incremental difference between the selected corresponding
master snapshot 210, and the newest master snapshot 210 it has available. The method
300 proceeds with the flow point 360.

At a step 355, the file server 110 performs a full volume copying operation,
responsive to the newest master snapshot 210 it has available. The method 300 proceeds
with the flow point 360.
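
The choice between the step 354 and the step 355 can be sketched as follows. This is only an illustration, assuming master snapshots are identified by integers that increase with time, and treating "older than" as "not newer than" so that a master snapshot still held by both sides can serve as the base; that reading, and the name plan_replication, are assumptions of this sketch.

```python
def plan_replication(server_masters, destination_master=None):
    """Decide between an incremental and a full volume copy.

    server_masters: master snapshot identifiers held at the file server,
        ordered oldest to newest (assumed to be comparable integers).
    destination_master: newest master snapshot known to the destination,
        or None if the destination indicated no master snapshot.
    Returns a tuple (kind, base, newest).
    """
    newest = server_masters[-1]
    if destination_master is not None:
        # Candidate bases: server snapshots not newer than the destination's.
        candidates = [s for s in server_masters if s <= destination_master]
        if candidates:
            # Step 354: incremental copy from the newest such snapshot.
            return ("incremental", max(candidates), newest)
    # Step 355: no usable corresponding snapshot, so copy the full volume.
    return ("full", None, newest)


plan_a = plan_replication([10, 20, 30], destination_master=20)
plan_b = plan_replication([20, 30], destination_master=10)
plan_c = plan_replication([10, 20], destination_master=None)
```

In this example, plan_a is incremental because the file server still holds snapshot 20, while plan_b and plan_c fall back to a full copy.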

At a flow point 360, the file server 110 has completed the volume
replication operation. The destination file system 120 updates its master snapshot 210 to
correspond to the master snapshot 210 that was used to make the file system transfer
from the file server 110.

Technical Appendix

A technical appendix, titled "WAFL Image Transfer," and having the 
inventors named as authors, forms a part of this specification, and is hereby incorporated 
by reference as if fully set forth herein. 

Alternative Embodiments 

Although preferred embodiments are disclosed herein, many variations are 
possible which remain within the concept, scope, and spirit of the invention, and these 
variations would become clear to those skilled in the art after perusal of this application.

Claims

1. A file system, having a plurality of storage blocks, and including a 
plurality of bits associated with each one of said plurality of storage blocks, at least one
of said plurality of bits identifying whether said one storage block was part of said file
system at a time earlier than a current consistent version of said file system. 

2. A file system as in claim 1, including a second one of said plurality
of bits identifying whether said one storage block was part of said file system at a second
time earlier than a current consistent version of said file system.

3. A file system as in claim 2, including an element disposed for 
selecting storage blocks in response to said one bit and said second one bit associated 
with said selected storage blocks. 
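
As an illustration of claims 1 through 3, the sketch below keeps one small integer of bits per storage block and selects blocks in response to two of those bits. The bit positions, the list-of-integers layout, and the function names are assumptions made for the example, not details taken from the specification.

```python
from typing import List, Sequence


def blocks_with_bit(block_bits: Sequence[int], bit: int) -> List[int]:
    """Block numbers whose entry has the given bit set."""
    return [n for n, bits in enumerate(block_bits) if (bits >> bit) & 1]


def blocks_added_since(
    block_bits: Sequence[int], older_bit: int, newer_bit: int
) -> List[int]:
    """Blocks marked in the newer version but not in the older one."""
    return [
        n
        for n, bits in enumerate(block_bits)
        if (bits >> newer_bit) & 1 and not (bits >> older_bit) & 1
    ]
```

With block_bits = [0b011, 0b010, 0b100, 0b110], bit 1 standing for an older version and bit 2 for a newer one, only block 2 is selected by blocks_added_since.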

4. A file system as in claim 3, including an element disposed for
copying said selected storage blocks to a destination. 

5. A file system as in claim 4, wherein said destination includes: a 
tape, a disk, a data structure in a second file system, a set of network messages, or a
destination distributed over a plurality of file systems.

6. A file system as in claim 1, including an element disposed for 
selecting storage blocks in response to said one bit associated with said selected storage
blocks.

7. A file system as in claim 6, including an element disposed for 
copying said selected storage blocks to a destination. 

8. A file system as in claim 7, wherein said destination includes: a
tape, a disk, a data structure in a second file system, a set of network messages, or a
destination distributed over a plurality of file systems.

9. A file system having a plurality of storage blocks, said file system
including a snapshot including a set of member storage blocks selected from said 
plurality, said member storage blocks forming a consistent file system other than an 
active file system; said snapshot being disposed as an object in said file system, wherein 
said file system is responsive to at least one file system request with regard to said 
snapshot. 

10. A file system as in claim 9, including 

a mark-on-allocate image of a set of member storage blocks selected from
said plurality, said member storage blocks having been added to said snapshot; and 

a storage image defined in response to said snapshot and said mark-on-allocate
image, said storage image indicating a set of member storage blocks selected
from said plurality. 

11. A file system as in claim 10, wherein said storage image is defined 
with regard to a logical sum operation on said snapshot and said mark-on-allocate image. 

12. A file system as in claim 9, including 

a mark-on-deallocate image of a set of member storage blocks selected 
from said plurality, said member storage blocks having been added to said snapshot; and 

a storage image defined in response to said snapshot and said mark-on-
deallocate image, said storage image indicating a set of member storage blocks selected 
from said plurality. 

13. A file system as in claim 9, including

a shadow snapshot of a set of member storage blocks selected from said
plurality, said member storage blocks having formed a consistent file system other than 
an active file system, with a set of selected member storage blocks removed from said 
consistent file system; and 

a storage image defined in response to said snapshot and said shadow 
snapshot, said storage image indicating a set of member storage blocks selected from said 
plurality.

14. A file system as in claim 9, including an indicator of which ones of
said member storage blocks have been copied. 



15. A file system as in claim 9, including a plurality of said snapshots; 
wherein said plurality of said snapshots are associated with an array of bits, said array
having one set of bits for substantially each storage block in said plurality of storage
blocks, said set of bits having at least one bit for each said snapshot. 

16. A file system as in claim 9, wherein said file system can manipulate
said snapshot without having to traverse a hierarchy of file system objects within said
snapshot.

17. A file system as in claim 9, wherein said snapshot includes a data 
structure disposed in a format allowing for a set management operation to be performed
relatively efficiently.

18. A file system as in claim 9, wherein said snapshot includes an array 
of bits, said array having one bit for substantially each storage block in said plurality. 

19. A file system as in claim 9, including

a plurality of said snapshots; and 

a storage image determined in response to said plurality of snapshots; 
said storage image defining a second set of member storage blocks selected 
from said plurality. 

20. A data structure as in claim 19, including an indicator of which ones 
of said storage blocks in said storage image have been copied.

21. A file system as in claim 19, wherein said storage image is a result
of a logical sum or difference performed on said set of member storage blocks for said
snapshot and a set of member storage blocks for a second said snapshot.

22. A file system as in claim 19, wherein said storage image is a result
of a logical sum or difference performed on said set of member storage blocks for said 
snapshot and a set of member storage blocks for a second said storage image. 

23. A file system as in claim 19, wherein said storage image is a result
of a set management operation on said set of member storage blocks for said snapshot.

24. A file system as in claim 9, wherein said snapshot includes a data 
structure disposed in a format allowing for a set management operation to be performed
in O(n) time or less, where n is a number of storage blocks in said plurality, without
reading any contents of said storage blocks in said plurality. 

25. A file system as in claim 24, wherein said set management operation 
is a logical sum or difference. 
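
One way to picture a logical sum or difference that runs in O(n) time without reading block contents is a pair of packed bitmaps with one bit per storage block. The sketch below assumes that representation; the function names and the use of Python bytes objects are illustrative choices, not part of the claims.

```python
def bitmap_sum(a: bytes, b: bytes) -> bytes:
    """Logical sum (union): a block is a member if it is in either bitmap."""
    if len(a) != len(b):
        raise ValueError("bitmaps must cover the same number of blocks")
    return bytes(x | y for x, y in zip(a, b))


def bitmap_difference(a: bytes, b: bytes) -> bytes:
    """Logical difference: blocks that are in a but not in b."""
    if len(a) != len(b):
        raise ValueError("bitmaps must cover the same number of blocks")
    return bytes(x & ~y & 0xFF for x, y in zip(a, b))
```

Each function makes a single pass over the bitmaps, so the work grows linearly with the number of storage blocks, and only the bitmaps themselves are examined.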

26. A file system as in claim 9, wherein said snapshot includes a data 
structure identifying which storage blocks in said plurality are member storage blocks of 
said snapshot. 

27. A file system as in claim 26, wherein said data structure uses no
more than about 1/100th of an amount of storage required by said storage blocks in said
plurality. 

28. A file system as in claim 26, wherein said data structure uses no 
more than about four bytes per storage block in said plurality.
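
The two size bounds in claims 27 and 28 can be related by simple arithmetic. The sketch below assumes a block size of 4096 bytes purely for illustration; the claims do not state a block size, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def map_overhead_fraction(bytes_per_block: int, block_size: int) -> Fraction:
    """Storage used by the per-block record, as a fraction of block storage."""
    return Fraction(bytes_per_block, block_size)


def min_block_size_for(bytes_per_block: int, limit: Fraction) -> int:
    """Smallest block size (in bytes) at which the overhead stays within limit."""
    return math.ceil(bytes_per_block / limit)
```

Under the assumed 4096-byte block, four bytes per block is 1/1024 of the block storage, which is well within a 1/100 bound; four bytes per block meets that bound for any block of 400 bytes or more.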

29. A method of operating a file server, said method including steps for 
forming a first snapshot of a first consistent state of said file system at a
selected time, said first snapshot including an indication of a set of storage blocks in said
first consistent state;

forming a second snapshot of a second consistent state of said file system,
said second snapshot including an indication of a set of storage blocks in said second 
consistent state; and 

performing an operation on said first and second snapshots to form a 
storage image including an indication of at least some storage blocks in said file system.

30. A method as in claim 29, wherein said operation includes a logical 
sum or difference. 

31. A method as in claim 29, wherein
said operation includes a logical sum or difference; and 
said purpose includes making a copy including or excluding a selected 
range of snapshots. 

32. A method as in claim 29, wherein 
said operation includes a logical sum or difference; and 
said purpose includes copying said storage image to a destination. 

33. A method as in claim 32, wherein said destination includes a tape, a 
disk, a data structure in a second file system, a set of network messages, or a destination
distributed over a plurality of file systems.

34. A method to be performed in a file system, said file system having a 
plurality of storage blocks, said method including steps for

defining a storage image of a set of member storage blocks selected from
said plurality, said storage image being formed in response to a set of member storage
blocks forming a consistent file system other than an active file system; and 

forming an image stream of a sequence of member storage blocks selected 
from said storage image. 

35. A method as in claim 34, including steps for associating a block
location with substantially each one of said sequence.
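
Claims 34 and 35 can be pictured as a generator that walks a storage image and emits each member block together with its block location. The sketch below is an assumption-laden illustration: the storage image is a sequence of booleans, read_block is a caller-supplied hypothetical function, and apply_image_stream shows one way a destination might consume the stream.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple


def form_image_stream(
    storage_image: Sequence[bool],
    read_block: Callable[[int], bytes],
) -> Iterator[Tuple[int, bytes]]:
    """Yield (block location, block data) for each member of the image."""
    for location, is_member in enumerate(storage_image):
        if is_member:
            yield location, read_block(location)


def apply_image_stream(
    stream: Iterable[Tuple[int, bytes]], destination: List[bytes]
) -> None:
    """Write each streamed block to the same location at the destination."""
    for location, data in stream:
        destination[location] = data
```

Because each block carries its location, the destination can place blocks without interpreting the file system structure they contain.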

36. A method as in claim 34, wherein said operation includes
reconstructing a file system in response to said image stream.



37. A method as in claim 34, wherein said steps for forming are 
performed in response to a selected operation to be performed on said member storage 
blocks, said operation being other than an operation on an active file system.

38. A method as in claim 34, wherein said steps for forming include
steps for optimizing said sequence of member storage blocks for a file system operation. 

39. A method as in claim 34, wherein said steps for forming include
steps for optimizing said sequence of member storage blocks for a file system operation
in a RAID file system. 

40. A method as in claim 34, wherein said steps for forming include
steps for

optimizing said sequence of member storage blocks in response to a
physical location in a storage medium for substantially each said member storage block,
said storage medium having a plurality of storage elements capable of being read in
parallel; and 

ordering said sequence of member storage blocks so that said member 
storage blocks can be substantially optimally read in parallel from said plurality of 
storage elements. 
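
The ordering step of claim 40 can be illustrated with a simple interleaving scheme. The sketch assumes, purely for the example, that each storage element holds a contiguous range of blocks_per_disk block numbers; the function name and that layout are hypothetical and are not taken from the specification.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, Iterable, List


def order_for_parallel_read(blocks: Iterable[int], blocks_per_disk: int) -> List[int]:
    """Interleave blocks so consecutive reads fall on different storage elements."""
    per_disk: Dict[int, List[int]] = defaultdict(list)
    for block in sorted(blocks):
        # Assumed layout: disk number is the block number divided by the disk size.
        per_disk[block // blocks_per_disk].append(block)
    queues = [per_disk[disk] for disk in sorted(per_disk)]
    ordered: List[int] = []
    # Take one block from each disk in turn until every queue is drained.
    for group in zip_longest(*queues):
        ordered.extend(block for block in group if block is not None)
    return ordered
```

For blocks 0, 1, 2, 10, and 11 with ten blocks per disk, the resulting order is 0, 10, 1, 11, 2, so that the two disks can be read in parallel rather than one after the other.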

41. A method as in claim 34, wherein said storage image represents a 
complete file system.

42. A method as in claim 34, wherein said storage image represents a set
of changes to a file system. 

43. A method as in claim 34, including repeating said selecting step at 
periodic intervals.

44. A method as in claim 34, including repeating said selecting step in
response to an operator command.

45. A method as in claim 34, including repeating said selecting step in 
response to a remote device. 

46. An incremental mirror copy of a file system, said incremental copy 
including a base set of storage blocks stored in a first storage medium, and an 
incremental set of storage blocks stored in a second storage medium. 

47. An incremental mirror copy as in claim 46, wherein 

said first storage medium is substantially slower than said second storage
medium; and

said incremental set of storage blocks is more recent than a time needed to 
recover said base set of storage blocks. 

48. An incremental mirror copy as in claim 46, wherein said incremental 
set of storage blocks is responsive to a plurality of updates of said file system. 

49. An incremental mirror copy as in claim 46, wherein said incremental 
set of storage blocks is responsive to a continuous sequence of updates of said file 
system, wherein said incremental mirror copy includes a substantially up to date set of 
storage blocks in said file system. 

50. An incremental mirror copy as in claim 46, wherein said incremental 
set of storage blocks is responsive to an indication of a set of storage blocks deallocated 
from said file system. 

51. Apparatus including

a file system including a plurality of snapshots thereof, each representing
an associated consistent state at an associated selected time; and

each said snapshot including an indication of a set of storage blocks in said
associated consistent state, said indication being recorded in at least one storage block in 
said associated consistent state. 

52. Apparatus as in claim 51, including a storage image defining at least
some storage blocks in said file system, said storage image responsive to an operation on 
at least two of said snapshots. 

53. An incremental mirror of a file system having a plurality of storage 
blocks, said incremental mirror including 

a first set of first member storage blocks selected from said plurality, said 
first member storage blocks forming a copy of a first consistent version of said file
system; and 

a second set of second member storage blocks selected from said plurality,
said second member storage blocks being responsive to said first consistent version and 
to a second consistent version of said file system, said second set including a set of 
changes between said first and second consistent version; 

said first set being stored in a first storage medium, and said second set 
being stored in a second storage medium of substantially different type; 

whereby a complete copy of said file system can be constructed from said
first set and said second set.
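
How a complete copy might be constructed from the first set and the second set can be sketched with two mappings from block location to block data. The dictionary representation, the function name, and the optional deallocated argument (which echoes claims 50 and 57) are assumptions made for this example.

```python
from typing import Dict, Iterable


def reconstruct_copy(
    base: Dict[int, bytes],
    incremental: Dict[int, bytes],
    deallocated: Iterable[int] = (),
) -> Dict[int, bytes]:
    """Overlay the incremental set on the base set to form a complete copy."""
    volume = dict(base)          # e.g. restored from the slower medium
    volume.update(incremental)   # newer blocks from the second medium win
    for location in deallocated:
        volume.pop(location, None)  # drop blocks no longer in the file system
    return volume
```

The base set can remain on a slower, larger medium while the smaller incremental set is kept on a faster one; only at recovery time are the two combined.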

54. An incremental mirror as in claim 53, wherein said first storage 
medium has much greater storage capacity and is relatively slower than said second 
storage medium. 

55. An incremental mirror copy as in claim 53, wherein said second set 
of member storage blocks is responsive to a plurality of updates of said file system. 

56. An incremental mirror copy as in claim 53, wherein said second set 
of member storage blocks is responsive to a continuous sequence of updates of said file
system, wherein said second set of member storage blocks includes a substantially up to
date set of storage blocks in said file system. 

57. An incremental mirror copy as in claim 53, wherein said second set
of member storage blocks is responsive to an indication of a set of storage blocks 
deallocated from said file system. 

58. In a file system having a plurality of storage blocks, a data structure
including

a first snapshot of a set of member storage blocks selected from said
plurality, said member storage blocks forming a consistent file system other than an
active file system; 

said first snapshot being represented as an object in said file system and 
having a set of storage blocks for recording said first snapshot; 

whereby copying said member storage blocks in said first snapshot has the 
property of preserving at least one snapshot recorded in said file system at a time of said
first snapshot. 

59. A data structure as in claim 58, including 

a second snapshot of a set of member storage blocks selected from said 
plurality, said member storage blocks forming a consistent file system other than an
active file system; 

said second snapshot being represented as an object in said file system and 
having a set of storage blocks for recording said second snapshot; 

whereby copying said member storage blocks in said second snapshot has 
the property of preserving at least one snapshot recorded in said file system at a time of 
said second snapshot. 

60. A data structure as in claim 58, including 

an image stream including a set of storage blocks including both said first 
snapshot and said second snapshot;

whereby copying said member storage blocks in said image stream has the
property of preserving both said first snapshot and said second snapshot. 

61. In a file system having a plurality of storage blocks, a data structure
including

a snapshot of a set of member storage blocks selected from said plurality, 
said member storage blocks forming a consistent file system other than an active file 
system; 

said snapshot being represented as an object in said file system and having 
a set of storage blocks for recording said snapshot; 

whereby a backup and restore operation on said file system has the 
property of preserving said snapshot within said file system. 

62. In a file system having a plurality of storage blocks, a data structure
including

a storage image of a set of member storage blocks selected from said
plurality;

said storage image being formed in response to a set of member storage 
blocks forming a consistent file system other than an active file system. 

63. A data structure as in claim 62, including

a first storage image indicating a set of member storage blocks forming a
consistent file system; and 

a sequence of incremental storage images, each having a predecessor, at 
least one of said predecessors being said first storage image; 

wherein a logical sum of said set of storage images includes at least one 
complete snapshot. 
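
The relationship stated in claim 63 can be checked with ordinary set operations. In this sketch each storage image is modeled as a Python set of block numbers; that representation and the function names are assumptions made for illustration.

```python
from typing import Iterable, Set


def logical_sum(images: Iterable[Set[int]]) -> Set[int]:
    """Union of a first storage image and its incremental successors."""
    total: Set[int] = set()
    for image in images:
        total |= image
    return total


def includes_snapshot(images: Iterable[Set[int]], snapshot: Set[int]) -> bool:
    """True if every member block of the snapshot appears in the logical sum."""
    return set(snapshot) <= logical_sum(images)
```

For a first image {0, 1, 2} followed by increments {3} and {1, 4}, the logical sum is {0, 1, 2, 3, 4}, which includes a snapshot whose member blocks are {0, 1, 3, 4}.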

64. A data structure as in claim 62, including an indicator of which ones 
of said storage blocks in said storage image have been copied. 




WO 00/07104 PCT/US99/17148

65. A data structure as in claim 62, wherein said storage image indicates 
a logical difference of two sets of member storage blocks, at least one of said sets 
forming a consistent file system. 

66. A data structure as in claim 62, wherein said storage image indicates
a logical sum of two sets of member storage blocks each collectively forming a
consistent file system. 

67. A data structure as in claim 62, wherein said storage image indicates 
a set of member storage blocks forming a consistent file system.

68. In a file system having a plurality of storage blocks, a data structure 
including a shadow snapshot of a set of member storage blocks selected from said
plurality, said member storage blocks having formed a consistent file system other than
an active file system, with a set of selected member storage blocks removed from said
consistent file system. 
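
A shadow snapshot as recited in claim 68 can be pictured as a snapshot's member set from which selected blocks have since been removed. The class below is a minimal sketch under that reading; the class name, the set representation, and the remove method are hypothetical.

```python
from typing import Iterable, Set


class ShadowSnapshot:
    """Member blocks of a former consistent file system, minus removed blocks."""

    def __init__(self, snapshot_members: Iterable[int]) -> None:
        # Start from the blocks that formed the consistent file system.
        self.members: Set[int] = set(snapshot_members)

    def remove(self, blocks: Iterable[int]) -> None:
        """Drop selected blocks, for example once the file system reuses them."""
        self.members -= set(blocks)
```

After removal the remaining members no longer form a complete consistent file system, but they still identify which of the original blocks are intact.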

69. A data structure as in claim 68, wherein said shadow snapshot is 
disposed in a format allowing for a set management operation to be performed relatively
efficiently.

70. A data structure as in claim 68, wherein said shadow snapshot uses,
in addition to said member storage blocks, no more than about 1/100th of an amount of
storage required by said storage blocks in said plurality. 

71. A data structure as in claim 68, wherein said shadow snapshot uses,
in addition to said member storage blocks, no more than about one byte per storage block 
in said plurality. 

72. A data structure as in claim 68, wherein said shadow snapshot is
disposed as a single object in said file system, whereby said file system can manipulate
said snapshot without having to traverse a hierarchy of file system objects within said
snapshot. 

73. A data structure as in claim 68, wherein said removed member 
storage blocks are responsive to completion of a processing operation.

74. A data structure as in claim 73, wherein said processing operation 
includes a file system operation. 

75. A data structure as in claim 73, wherein said processing operation
includes reuse of said selected member storage blocks by said file system.

76. A data structure as in claim 68, wherein said shadow snapshot is 
disposed in a format allowing for a set management operation to be performed in O(n)
time or less, where n is a number of storage blocks in said plurality, without reading any
contents of said storage blocks in said plurality. 

77. A data structure as in claim 76, wherein said set management 
operation is a logical sum or difference. 

78. In a file system having a plurality of storage blocks, a data structure 
including a mark-on-allocate image of a set of member storage blocks selected from said
plurality, said member storage blocks having been added to a snapshot that originally
formed a consistent file system.

79. A data structure as in claim 78, wherein said mark-on-allocate
storage image is disposed as a single object in said file system, whereby said file system
can manipulate said snapshot without having to traverse a hierarchy of file system 
objects within said snapshot. 

80. A data structure as in claim 78, wherein said mark-on-allocate image
is disposed in a format allowing for a set management operation to be performed 
relatively efficiently. 

81. A data structure as in claim 78, wherein said mark-on-allocate
storage image uses no more than about 1/100th of an amount of storage required by said
storage blocks in said plurality. 

82. A data structure as in claim 78, wherein said mark-on-allocate image 
uses no more than about four bytes per storage block in said plurality. 

83. A data structure as in claim 78, said member storage blocks having 
been selected responsive to completion of a processing operation. 

84. A data structure as in claim 83, wherein said processing operation 
includes a file system operation. 

85. A data structure as in claim 83, wherein said processing operation 
includes reuse of said selected member storage blocks by said file system. 

86. A data structure as in claim 78, wherein said mark-on-allocate image
is disposed in a format allowing for a set management operation to be performed in O(n)
time or less, where n is a number of storage blocks in said plurality, without reading any
contents of said storage blocks in said plurality.

87. A data structure as in claim 86, wherein said set management 
operation is a logical sum or difference. 

88. In a file system having a plurality of storage blocks, a data structure 
including a mark-on-deallocate image of a set of member storage blocks selected from 
said plurality, said member storage blocks having been removed from a snapshot that
originally formed a consistent file system.

89. A data structure as in claim 88, wherein said mark-on-deallocate
storage image is disposed as a single object in said file system, whereby said file system 
can manipulate said snapshot without having to traverse a hierarchy of file system 
objects within said snapshot. 

90. A data structure as in claim 88, wherein said mark-on-deallocate 
image is disposed in a format allowing for a set management operation to be performed 
relatively efficiently. 

91. A data structure as in claim 88, wherein said mark-on-deallocate
storage image uses no more than about 1/100th of an amount of storage required by said
storage blocks in said plurality. 

92. A data structure as in claim 88, wherein said mark-on-deallocate
image uses no more than about four bytes per storage block in said plurality.

93. A data structure as in claim 88, wherein said mark-on-deallocate
image is disposed in a format allowing for a set management operation to be performed
in O(n) time or less, where n is a number of storage blocks in said plurality, without
reading any contents of said storage blocks in said plurality.

94. A data structure as in claim 93, wherein said set management 
operation is a logical sum or difference.

AMENDED CLAIMS
[received by the International Bureau on 11 January 2000 (11.01.00);
original claims 51, 58, 61, 62, and 68 amended;
remaining claims unchanged (8 pages)] 

said incremental set of storage blocks is more recent than a time 
needed to recover said base set of storage blocks. 

48. An incremental mirror copy as in claim 46, wherein said
incremental set of storage blocks is responsive to a plurality of updates of said file
system. 

49. An incremental mirror copy as in claim 46, wherein said 
incremental set of storage blocks is responsive to a continuous sequence of updates
of said file system, wherein said incremental mirror copy includes a substantially
up to date set of storage blocks in said file system. 

50. An incremental mirror copy as in claim 46, wherein said 
incremental set of storage blocks is responsive to an indication of a set of storage
blocks deallocated from said file system.

51. Apparatus including

a file system including a plurality of snapshots thereof, each 
representing an associated consistent state at an associated selected time; and

each said snapshot including an indication of a set of storage blocks
in said associated consistent state, at least one storage block selected by a bit at a
row and a column bit plane intersection, said indication being recorded in at least
one storage block in said associated consistent state. 
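
The phrase "a bit at a row and a column bit plane intersection" can be read as indexing a two-dimensional array of bits. The sketch below assumes, for illustration only, that each row corresponds to a storage block and each column (bit plane) to a snapshot; the amended claims do not spell out that mapping here, and the function names are hypothetical.

```python
from typing import List, Sequence


def select_bit(rows: Sequence[int], row: int, column: int) -> bool:
    """Bit at the intersection of one row and one column bit plane."""
    return bool((rows[row] >> column) & 1)


def bit_plane(rows: Sequence[int], column: int) -> List[int]:
    """Row numbers whose bit is set in the given column bit plane."""
    return [n for n in range(len(rows)) if select_bit(rows, n, column)]
```

Under the assumed mapping, reading one column yields the member blocks of one snapshot, and reading one row yields the snapshots that include one block.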

52. Apparatus as in claim 51, including a storage image defining at 
least some storage blocks in said file system, said storage image responsive to an 
operation on at least two of said snapshots. 

53. An incremental mirror of a file system having a plurality of
storage blocks, said incremental mirror including

a first set of first member storage blocks selected from said plurality,
said first member storage blocks forming a copy of a first consistent version of said 
file system; and 

a second set of second member storage blocks selected from said 
plurality, said second member storage blocks being responsive to said first
consistent version and to a second consistent version of said file system, said 
second set including a set of changes between said first and second consistent 
version; 

said first set being stored in a first storage medium, and said second
set being stored in a second storage medium of substantially different type;

whereby a complete copy of said file system can be constructed from 
said first set and said second set. 

54. An incremental mirror as in claim 53, wherein said first
storage medium has much greater storage capacity and is relatively slower than said
second storage medium.

55. An incremental mirror copy as in claim 53, wherein said 
second set of member storage blocks is responsive to a plurality of updates of said
file system.

56. An incremental mirror copy as in claim 53, wherein said 
second set of member storage blocks is responsive to a continuous sequence of 
updates of said file system, wherein said second set of member storage blocks
includes a substantially up to date set of storage blocks in said file system.

57. An incremental mirror copy as in claim 53, wherein said 
second set of member storage blocks is responsive to an indication of a set of 
storage blocks deallocated from said file system.

58. In a file system having a plurality of storage blocks, a data
structure including 

a first snapshot of a set of member storage blocks selected from said 
plurality, at least one member storage block selected by a bit at a row and a column 
bit plane intersection, said member storage blocks forming a consistent file system
other than an active file system; 

said first snapshot being represented as an object in said file system 
and having a set of storage blocks for recording said first snapshot; 

whereby copying said member storage blocks in said first snapshot
has the property of preserving at least one snapshot recorded in said file system at a
time of said first snapshot.

59. A data structure as in claim 58, including

a second snapshot of a set of member storage blocks selected from 
said plurality, said member storage blocks forming a consistent file system other
than an active file system; 

said second snapshot being represented as an object in said file 
system and having a set of storage blocks for recording said second snapshot; 

whereby copying said member storage blocks in said second snapshot 
has the property of preserving at least one snapshot recorded in said file system at a
time of said second snapshot. 

60. A data structure as in claim 58, including 

an image stream including a set of storage blocks including both said
first snapshot and said second snapshot;

whereby copying said member storage blocks in said image stream 
has the property of preserving both said first snapshot and said second snapshot. 

61. In a file system having a plurality of storage blocks, a data 
structure including

a snapshot of a set of member storage blocks selected from said
plurality, at least one member storage block selected by a bit at a row and a column
bit plane intersection, said member storage blocks forming a consistent file system
other than an active file system;

said snapshot being represented as an object in said file system and
having a set of storage blocks for recording said snapshot;

whereby a backup and restore operation on said file system has the 
property of preserving said snapshot within said file system. 

62. In a file system having a plurality of storage blocks, a data
structure including

a storage image of a set of member storage blocks selected from said
plurality, at least one member storage block selected by a bit at a row and a column
bit plane intersection;

said storage image being formed in response to a set of member
storage blocks forming a consistent file system other than an active file system.

63. A data structure as in claim 62, including 

a first storage image indicating a set of member storage blocks 
forming a consistent file system; and

a sequence of incremental storage images, each having a predecessor,
at least one of said predecessors being said first storage image; 

wherein a logical sum of said set of storage images includes at least 
one complete snapshot. 

64. A data structure as in claim 62, including an indicator of which 
ones of said storage blocks in said storage image have been copied. 

65. A data structure as in claim 62, wherein said storage image 
indicates a logical difference of two sets of member storage blocks, at least one of
said sets forming a consistent file system.

66. A data structure as in claim 62, wherein said storage image
indicates a logical sum of two sets of member storage blocks each collectively
forming a consistent file system.

67. A data structure as in claim 62, wherein said storage image 
indicates a set of member storage blocks forming a consistent file system.

68. In a file system having a plurality of storage blocks, a data 
structure including a shadow snapshot of a set of member storage blocks selected
from said plurality, at least one member storage block selected by a bit at a row and
a column bit plane intersection, said member storage blocks having formed a
consistent file system other than an active file system, with a set of selected
member storage blocks removed from said consistent file system.

69. A data structure as in claim 68, wherein said shadow snapshot 
is disposed in a format allowing for a set management operation to be performed 
relatively efficiently. 

70. A data structure as in claim 68, wherein said shadow snapshot
uses, in addition to said member storage blocks, no more than about 1/100th of an
amount of storage required by said storage blocks in said plurality. 

71. A data structure as in claim 68, wherein said shadow snapshot 
uses, in addition to said member storage blocks, no more than about one byte per
storage block in said plurality.

72. A data structure as in claim 68, wherein said shadow snapshot 
is disposed as a single object in said file system, whereby said file system can
manipulate said snapshot without having to traverse a hierarchy of file system
objects within said snapshot.

73. A data structure as in claim 68, wherein said removed member
storage blocks are responsive to completion of a processing operation. 

74. A data structure as in claim 73, wherein said processing
operation includes a file system operation.

75. A data structure as in claim 73, wherein said processing
operation includes reuse of said selected member storage blocks by said file
system.



76. A data structure as in claim 68, wherein said shadow snapshot
is disposed in a format allowing for a set management operation to be performed in
O(n) time or less, where n is a number of storage blocks in said plurality, without
reading any contents of said storage blocks in said plurality.

77. A data structure as in claim 76, wherein said set management 
operation is a logical sum or difference. 
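
The logical sum and difference recited in claims 76 and 77 can be illustrated with a minimal sketch. It assumes a layout the claims do not spell out: one integer per storage block (a row), with each bit position (a column, or bit plane) standing for one snapshot. The function names and the sample map are hypothetical and chosen only for illustration. Each operation makes a single pass over the n map entries and never reads the contents of a storage block.

```python
from typing import List


def logical_sum(blockmap: List[int], plane_a: int, plane_b: int) -> List[int]:
    """Blocks that are members of snapshot plane_a OR snapshot plane_b."""
    mask = (1 << plane_a) | (1 << plane_b)
    # One pass over the map entries: O(n) in the number of storage blocks.
    return [blk for blk, entry in enumerate(blockmap) if entry & mask]


def logical_difference(blockmap: List[int], plane_a: int, plane_b: int) -> List[int]:
    """Blocks that are members of snapshot plane_a but NOT of snapshot plane_b."""
    return [
        blk
        for blk, entry in enumerate(blockmap)
        if (entry >> plane_a) & 1 and not (entry >> plane_b) & 1
    ]


if __name__ == "__main__":
    # Hypothetical four-block map: bit 0 = older snapshot, bit 1 = newer snapshot.
    blockmap = [0b01, 0b11, 0b10, 0b00]
    print(logical_sum(blockmap, 0, 1))         # [0, 1, 2]
    print(logical_difference(blockmap, 1, 0))  # [2]
```

In this sketch the difference of a newer and an older snapshot yields the blocks present only in the newer one, which is the kind of set that could drive an incremental copy.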



78. In a file system having a plurality of storage blocks, a data
structure including a mark-on-allocate image of a set of member storage blocks
selected from said plurality, said member storage blocks having been added to a
snapshot that originally formed a consistent file system.

79. A data structure as in claim 78, wherein said mark-on-allocate
storage image is disposed as a single object in said file system, whereby said file
system can manipulate said snapshot without having to traverse a hierarchy of file
system objects within said snapshot.

80. A data structure as in claim 78, wherein said mark-on-allocate
image is disposed in a format allowing for a set management operation to be
performed relatively efficiently.

81. A data structure as in claim 78, wherein said mark-on-allocate
storage image uses no more than about 1/100th of an amount of storage required by
said storage blocks in said plurality.

82. A data structure as in claim 78, wherein said mark-on-allocate
image uses no more than about four bytes per storage block in said plurality.

83. A data structure as in claim 78, said member storage blocks 
having been selected responsive to completion of a processing operation. 

84. A data structure as in claim 83, wherein said processing
operation includes a file system operation.

85. A data structure as in claim 83, wherein said processing
operation includes reuse of said selected member storage blocks by said file
system.

86. A data structure as in claim 78, wherein said mark-on-allocate
image is disposed in a format allowing for a set management operation to be
performed in O(n) time or less, where n is a number of storage blocks in said
plurality, without reading any contents of said storage blocks in said plurality.

87. A data structure as in claim 86, wherein said set management
operation is a logical sum or difference.



88. In a file system having a plurality of storage blocks, a data
structure including a mark-on-deallocate image of a set of member storage blocks
selected from said plurality, said member storage blocks having been removed from
a snapshot that originally formed a consistent file system.

89. A data structure as in claim 88, wherein said mark-on-deallocate
storage image is disposed as a single object in said file system, whereby
said file system can manipulate said snapshot without having to traverse a hierarchy
of file system objects within said snapshot.

90. A data structure as in claim 88, wherein said mark-on-deallocate
image is disposed in a format allowing for a set management operation
to be performed relatively efficiently.

91. A data structure as in claim 88, wherein said mark-on-deallocate
storage image uses no more than about 1/100th of an amount of storage
required by said storage blocks in said plurality.

92. A data structure as in claim 88, wherein said mark-on-deallocate
image uses no more than about four bytes per storage block in said
plurality.

93. A data structure as in claim 88, wherein said mark-on-deallocate
image is disposed in a format allowing for a set management operation
to be performed in O(n) time or less, where n is a number of storage blocks in said
plurality, without reading any contents of said storage blocks in said plurality.

94. A data structure as in claim 93, wherein said set management
operation is a logical sum or difference.

STATEMENT UNDER ARTICLE 19(1)

Independent claims 51, 58, 61, 62, and 68 are amended to include a phrase
for selecting a storage block or a member storage block by "a bit at a row and a
column bit plane intersection." This amendment will have no impact on the
description or the drawings, since page 8, lines 14 to 20 describe this feature. The
amended independent claims 51, 58, 61, 62, and 68 and the corresponding
dependent claims 52 to 57, 59, 60, 63 to 67, and 69 to 77, which incorporate the
corresponding amendments, are thereby further distinguished from EP 0 566 967 A
(INTERNATIONAL BUSINESS MACHINES) 27 October 1993 (1993-10-27) column
5, line 57 to column 9, line 8.
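
The phrase "a bit at a row and a column bit plane intersection" can be pictured with a short sketch. It is an illustration only, under stated assumptions: each row is taken to be one storage block's map entry and each column (bit plane) one snapshot, with a default of 32 planes chosen to match the "four bytes per storage block" figure in claims 82 and 92. The class and method names are hypothetical and do not come from the description.

```python
class BlockMap:
    """Hypothetical block map: one row per storage block, one bit plane per snapshot."""

    def __init__(self, num_blocks: int, num_planes: int = 32) -> None:
        self.num_planes = num_planes
        self.rows = [0] * num_blocks  # each row is an integer used as a bit field

    def _check(self, plane: int) -> None:
        if not 0 <= plane < self.num_planes:
            raise ValueError("bit plane out of range")

    def mark(self, block: int, plane: int) -> None:
        """Set the bit at the intersection of row `block` and column `plane`."""
        self._check(plane)
        self.rows[block] |= 1 << plane

    def clear(self, block: int, plane: int) -> None:
        """Clear the bit at the intersection of row `block` and column `plane`."""
        self._check(plane)
        self.rows[block] &= ~(1 << plane)

    def is_member(self, block: int, plane: int) -> bool:
        """True if the storage block is selected in the given bit plane."""
        self._check(plane)
        return ((self.rows[block] >> plane) & 1) == 1

    def members(self, plane: int) -> list:
        """All storage blocks selected in one bit plane, found without reading block data."""
        return [b for b in range(len(self.rows)) if self.is_member(b, plane)]


if __name__ == "__main__":
    bm = BlockMap(num_blocks=8)
    bm.mark(3, 0)
    bm.mark(5, 0)
    bm.mark(5, 1)
    print(bm.members(0))  # [3, 5]
    print(bm.members(1))  # [5]
```

Keeping membership as single bits in a compact map is what would allow a whole snapshot to be handled as one object rather than as a hierarchy of file system objects.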

This reference was cited as a document of particular relevance. This
reference "particularly relates to providing a backup session secured to a single one
of a plurality of accessing data processing systems," as it recites in column 1, lines
9 to 12. The reference further states in column 6, lines 29 to 34, that "the time zero
backup process is vulnerable in a manner very similar to that of backup systems
in the prior art. That is, all backup operations must be rerun if the process
terminates abnormally prior to completion."

However, the present invention is more flexible in backup capability and is
directed to a different invention, namely a system and method for duplicating all or
part of a file system while maintaining consistent copies of the file system by
maintaining a set of snapshots, each snapshot indicating a set of storage blocks
making up a consistent copy of the file system as it was at a known time. Each
snapshot can optionally be used for other purposes, such as duplicating or
transferring a backup copy of the file system to a destination storage medium, or
can be manipulated to identify sets of storage blocks in the file system for incremental
backup or copying.

Independent claims 1, 9, 29, 34, 46, 78, and 88 and the corresponding dependent
claims 2 to 8, 10 to 28, 30 to 33, 35 to 45, 47 to 50, 79 to 87, and 89 to 94 are
unchanged. These unchanged claims are believed to include an inventive step.
They include similar limitations.